Motivating application
Saliency maps
The captum package
India lacks centralized renewable energy databases.
Questions:
- Solar plant counts over time?
- Geographic distribution?
Needed for energy transition planning (Ortiz2022; solar_data).
Question. Can we locate solar farms in satellite imagery?
Data. Sentinel-2 imagery (10-60 m/px, 12 bands) + OpenStreetMap labels. 1363 labeled sites.
Constraint. Must isolate pixels responsible for predictions.
Models can pick up on spurious correlations (e.g., whether there is snow in the background), leading to generalization failure.
Labels. \(y \in \{0, 1\}^{256 \times 256}\) (pixel labels)
Predictors. \(x \in \mathbb{R}^{10 \times 256 \times 256}\) (spectral patch)
Model. \(f(x; \theta) \to [0, 1]^{256 \times 256}\) (probability mask)
Goal. Find attributions \(\varphi \in \mathbb{R}^{256 \times 256}\) explaining \(f(x)\).
Perturb input \(x\) to measure prediction changes.
Occlusion. Mask patches, observe \(\Delta f(x)\)
Feature Ablation. Replace features with baseline, measure marginal effects
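The occlusion idea above can be sketched in a few lines. This is a toy illustration, not the Captum implementation: `score` is a hypothetical stand-in for a trained model, and the patch size and baseline value are arbitrary choices.

```python
# Occlusion sketch: replace square patches with a baseline value and record
# how much the model's score drops for each patch.
import numpy as np

def score(x):
    # Toy "model": responds to the bright region in the center of the image.
    return float(x[8:16, 8:16].sum())

def occlusion_map(x, patch=8, baseline=0.0):
    """Attribution = drop in score when each patch is set to the baseline."""
    h, w = x.shape
    attr = np.zeros_like(x, dtype=float)
    base = score(x)
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            x_occ = x.copy()
            x_occ[i:i + patch, j:j + patch] = baseline
            attr[i:i + patch, j:j + patch] = base - score(x_occ)
    return attr

x = np.zeros((24, 24))
x[8:16, 8:16] = 1.0                 # bright object in the center
attr = occlusion_map(x, patch=8)
print(attr[12, 12], attr[0, 0])     # center patch matters, corner does not
```

Feature ablation follows the same template, replacing individual features (rather than spatial patches) with the baseline.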
\[E_{\text{grad}}(x) = \frac{\partial S_c(x)}{\partial x}\]
Identifies pixels where small changes maximally affect class \(c\) score.
Refinement: \(x \odot \frac{\partial S_c}{\partial x}\) accounts for feature scale.
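The difference between the raw gradient and the \(x \odot \partial S_c / \partial x\) refinement is easiest to see for a linear score, where the gradient is available in closed form. The weights and inputs below are made up for illustration.

```python
# For a linear class score S_c(x) = w . x, the gradient is just w,
# so the two attribution rules are easy to compare directly.
import numpy as np

w = np.array([5.0, 1.0, 0.0])       # hypothetical class-c weights
x = np.array([0.0, 2.0, 3.0])       # input features

grad = w                            # dS_c/dx for a linear model
grad_x_input = x * grad             # refinement: x * dS_c/dx

print(grad)          # feature 0 looks most important by raw gradient...
print(grad_x_input)  # ...but it is zero in this input, so x*grad assigns it 0
```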
Saturation. Flat regions (ReLU) yield zero gradients despite importance.
Noise. High-frequency artifacts from shattered gradients.
Instability. Gradients can change sharply even in regions where predictions are stable.
\[\text{IG}_{i}(x) = \left(x_{i} - x_{i}'\right) \int_{\alpha=0}^{1} \frac{\partial f\left(x' + \alpha\left(x - x'\right)\right)}{\partial x_{i}} d\alpha\]
\(x'\): baseline (e.g., black image)
Aggregates gradients along the path \(x' \to x\).
Satisfies completeness: \(\sum_i \text{IG}_i(x) = f(x) - f(x')\)
Riemann sum with \(m \in [50, 200]\) steps:
\[\text{IG}_i(x) \approx (x_i - x'_i) \sum_{k=1}^m \frac{\partial f(x' + \frac{k}{m}(x - x'))}{\partial x_i} \cdot \frac{1}{m}\]
Cost: \(\mathcal{O}(m)\) backpropagations.
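The Riemann-sum estimator can be checked on a function whose gradient we know analytically. The sketch below uses \(f(x) = \sum_i x_i^2\) (an assumption made so no autograd is needed) and verifies completeness numerically.

```python
# Riemann-sum IG estimator for f(x) = sum(x**2), whose gradient is 2x.
import numpy as np

def grad_f(x):
    return 2.0 * x                     # analytic gradient of f(x) = sum(x**2)

def integrated_gradients(x, baseline, m=200):
    total = np.zeros_like(x)
    for k in range(1, m + 1):          # one gradient evaluation per step
        total += grad_f(baseline + (k / m) * (x - baseline))
    return (x - baseline) * total / m

x = np.array([1.0, -2.0, 0.5])
baseline = np.zeros_like(x)
ig = integrated_gradients(x, baseline)

f = lambda z: float((z ** 2).sum())
print(ig.sum(), f(x) - f(baseline))    # completeness: the two should be close
```

Each of the `m` steps costs one gradient evaluation, matching the \(\mathcal{O}(m)\) backpropagation cost noted above.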
PyTorch interpretability library:
- captum.attr: Saliency, IG, DeepLift, SHAP
- captum.metrics: Robustness
- captum.concept: TCAV (next week)

Goal: Write the loop for Integrated Gradients.
Initialize total_gradients = 0.
Define baseline and steps = 50.
For k in range 1 to steps:
### ???
attribution = (input - baseline) * (total_gradients / steps)
SHAP treats pixels as the players in a cooperative game.
\[\varphi_i = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!(|N|-|S|-1)!}{|N|!} [v(S \cup \{i\}) - v(S)]\]
Problem: \(256 \times 256\) image has \(2^{65536}\) coalitions.
Approximations:
- KernelSHAP: weighted regression
- Superpixels: group pixels to reduce \(|N|\)
Captum provides KernelShap and ShapleyValueSampling.
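For a game small enough to enumerate, the Shapley formula can be applied exactly. The value function `v` below is a made-up 3-player game; the point is that with \(2^{65536}\) coalitions this enumeration is hopeless, hence the approximations above.

```python
# Exact Shapley values for a tiny 3-player game, directly from the formula.
from itertools import combinations
from math import factorial

N = (0, 1, 2)

def v(S):
    # Toy coalition value: players 0 and 1 are complementary; 2 acts alone.
    S = set(S)
    return (1.0 if {0, 1} <= S else 0.0) + (0.5 if 2 in S else 0.0)

def shapley(i):
    others = [j for j in N if j != i]
    phi, n = 0.0, len(N)
    for r in range(len(others) + 1):
        for S in combinations(others, r):
            weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
            phi += weight * (v(S + (i,)) - v(S))
    return phi

print([shapley(i) for i in N])   # efficiency: values sum to v(N) - v(())
```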
Another idea is to restrict the collection of sets in the summation.
This is most natural when there is a notion of distance between features. For example, for a word at the start of a sentence, don’t bother with sets of words near the end.

Let \(N_k(i)\) = features within distance \(k\) of \(i\).
\[\varphi^{L}(f, i) = \frac{1}{|N_k(i)|} \sum_{\substack{S \subseteq N_k(i) \\ i \in S}} \frac{1}{\binom{|N_k(i)|-1}{|S|-1}} [v(S) - v(S \setminus \{i\})]\]
Efficient when \(k \ll |N|\).
